1. Introduction

This report is concerned with the performance of 
software used in coordinate measurement systems 
(CMSs) to evaluate the geometric characteristics of
manufactured parts. The particular performance char- 
acteristics of interest are those that impact the uncer- 
tainty of measurement results produced by CMSs. 

Inspection planners need quantitative measures of 
performance to develop uncertainty budgets and to eval- 
uate the quality of measurement results. In support of 
this need, NIST has recently established a Special Test 
Service, the Algorithm Testing and Evaluation Program 
for Coordinate Measurement Systems (ATEP-CMS), to 
measure the performance of geometric fitting software 
used in CMSs [1]. This report documents and explains 
the performance measures used in ATEP-CMS. 

Geometric fitting is the process of computing the 
representation parameters of a geometric element that in 
some sense best represents a set of point coordinate 
data. This representative geometric element is called the
substitute geometry for the data points. A manufactured
hole, for instance, is usually not perfectly cylindrical 
because the process that produced it can never be totally 
perfect. Inspection of the hole might involve measuring 
the coordinates of selected points on the surface of the 
hole, fitting a cylinder to the measured points to mini- 
mize the sum of squares of the orthogonal distances 
from the points to the cylinder, and comparing the posi- 
tion, orientation, and size of the fitted cylinder to the 
dimensions and tolerances of the part specification. 

Many factors affect the accuracy of inspection procedures.^1
This report focuses on one of these factors: how
close the computed fit is to the intended, mathematically
defined, substitute geometry. The performance
measures used in ATEP-CMS quantify how well the
software computes substitute geometries over a range of
inspection problems.

^1 ATEP-CMS, part of a larger field of endeavor called computational
metrology [2], addresses the fitting software as a source of uncertainty
due to possible computational errors. It does not address the
propagation of point coordinate uncertainty through the geometric
fitting computations, how well the points represent the surface, or
whether the intended software function is the most appropriate for the
task.

The next section provides background information on 
the need for algorithm testing, how ATEP-CMS works, 
and criteria for performance measures. Section 3 pre- 
sents details of the test methods used in ATEP-CMS for 
various types of geometric elements. Section 4 discusses 
a statistical interpretation of algorithm performance and 
explains the method used in ATEP-CMS to summarize 
and interpret the test results. Section 5 presents an un- 
certainty analysis of ATEP-CMS results. 

2. Background 

The drive to develop methods for testing CMS 
algorithms began in Germany in the early 1980s, when 
it became apparent that the software supplied with com- 
mercial coordinate measurement machines (CMMs) 
varied widely in quality [3, 4]. In the United States, the 
American Society of Mechanical Engineers established 
a task force on CMM software that is now designated 
B89.4.10. Work intensified in 1988 following wide- 
spread notification within the defense industries of 
serious problems with certain commercial CMM 
software [5]. A U.S. standard on performance evaluation 
of CMS software is now in draft form and is expected to 
be issued for public review in early 1996 [6]. NIST's 
ATEP-CMS service [1] supports this emerging 
standard. 

2.1 What is Evaluated by ATEP 

When discussing software, the term "testing" is used 
in many different ways, and it is important to understand 
the sense in which it is used within ATEP-CMS. It is 
specifically not used in the sense of software engineer- 
ing, where software testing is aimed at supporting de- 
bugging or maintenance and typically involves code 
structure analysis, walk-throughs, and similar activities. 
Within ATEP-CMS, testing is strictly limited to black-box
testing of how the actual behavior of the software
compares with its intended behavior. Moreover, ATEP- 
CMS restricts itself to measuring the accuracy of re- 
ported results. Other performance characteristics — such 
as memory requirements, computing time, ease of use, 
and other factors — are not considered. In sum, ATEP- 
CMS ignores the fact that the software under test is 
software; the ATEP-CMS process would be unchanged 
if the fitting methods were implemented with an analog 
computer, as a digital filter circuit, or even mechanically [7].



ATEP-CMS does not address whether the intended 
behavior of the software is appropriate for a particular 
measurement task. That issue is not one of performance, 
but of whether a particular data analysis method is the 
right tool for the job. Such considerations are beyond the 
scope of a performance testing program. 

ATEP-CMS currently supports testing of orthogonal
distance regression fitting^2 for seven geometry types:
line, circle, plane, sphere, cylinder, cone, and torus.
(Lines and circles are fit to three-dimensional point coordinates,
which need not be coplanar.) The reader is
referred to Ref. [9] for information on how to use
ATEP-CMS and the procedures used by NIST for conducting
a test. Section 3 below describes the specific
difference parameters used in ATEP-CMS for each
geometry type.
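The distinction between ordinary least squares and orthogonal distance regression, which the terminology footnote in this section draws, can be illustrated with a small two-dimensional sketch. This is not the ATEP-CMS reference algorithm; the function names and the sample points are hypothetical, and the closed-form principal-axis formula is used only because it keeps the example short.

```python
import math

def fit_line_ols(points):
    """Ordinary least squares fit y = a + b*x (minimizes vertical deviations)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx  # (slope b, intercept a)

def fit_line_odr(points):
    """Orthogonal distance regression line as (centroid, unit direction)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Principal axis of the scatter: direction of greatest variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def orthogonal_ss(points, origin, direction):
    """Sum of squared perpendicular distances from the points to a line."""
    ox, oy = origin
    dx, dy = direction
    norm = math.hypot(dx, dy)
    return sum((((x - ox) * dy - (y - oy) * dx) / norm) ** 2 for x, y in points)

# Hypothetical sample data, chosen so the two fits differ visibly.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
slope, intercept = fit_line_ols(points)
centroid, direction = fit_line_odr(points)
ss_ols = orthogonal_ss(points, (0.0, intercept), (1.0, slope))
ss_odr = orthogonal_ss(points, centroid, direction)
print(slope, direction[1] / direction[0], ss_ols, ss_odr)
```

For these points the two methods return different lines, and the orthogonal distance regression line has the smaller sum of squared perpendicular distances, which is the quantity it minimizes.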

2.2 Criteria for Performance Measures

In developing performance measures, the question 
naturally arises as to how one might choose one measure 
over another. One approach is to seek a reliable method 
of certifying software as "good" or "bad." (This is the 
approach taken, for instance, by the German testing 
program [3].) 

The approach taken in ATEP-CMS is somewhat dif- 
ferent. We start with the observation that software is not 
mathematically perfect. (If nothing else, the end results 
must be represented in finite precision.) So rather than 
developing a pass/fail criterion, ATEP-CMS quantifies 
the expected performance numerically. The goal is to
develop numerical measures of performance that
give a user of the software the information needed
to decide whether the software is adequate for a particular
application. To this end, we have used the following
criteria in developing ATEP-CMS performance mea- 
sures: 

• each measure should be directly related to inspection
tasks;

• each measure should combine like a standard deviation
with other sources of uncertainty in a CMS;

• each measure should have reasonable probability and
coverage interpretations;

• an estimate of uncertainty should be derivable for
each measure.

^2 Orthogonal distance regression is commonly called "least squares"
by manufacturing practitioners. Statisticians consider these two terms
to be quite different. Least squares measures deviations from the
surface along a particular direction in a coordinate system. Orthogonal
distance regression (which is also called "errors in variables" regression
or, for linear fits, "total least squares") measures deviations
perpendicular to the surface. Unfortunately, some of the literature (e.g.,
Ref. [8]) further confuses the terminology by using the term
"orthogonal distance regression" for one specific algorithm that solves
the orthogonal distance regression problem.

The first three criteria directly support the goal of using 
the performance measures to establish uncertainty bud- 
gets for CMSs. The last criterion reflects the view that 
an assessment of software uncertainty is itself a mea- 
surement. By NIST [10] and international [11] guideli- 
nes, a quantitative statement of uncertainty should be 
part of any complete measurement result. The last crite- 
rion says that the ATEP-CMS performance measures 
must be amenable to such a treatment. 



3. ATEP-CMS Test Methods 

To start a software test, NIST generates a collection of
data sets for each geometry type, consisting of three- 
dimensional point coordinates together with some book- 
keeping information. NIST also generates a fit for each 
data set, called the reference fit for that data set. Within 
ATEP-CMS, the reference fits are generated by fitting 
the data sets using NIST-developed fitting algorithms.^3
(It is also possible to start with the desired fits and 
generate data sets having those fits as answers [13].) 

The data sets are then sent to the software under test, 
which generates a test fit for each data set. The test fits, 
represented by a set of parameter values in a standard 
format, are returned to NIST for evaluation with respect 
to the corresponding reference fits. 

This section describes how differences between a test 
fit and the corresponding reference fit are evaluated in 
ATEP-CMS. Differences between each pair of fits are 
represented by a set of difference parameters. Each ge- 
ometry type gives rise to its own set of difference 
parameters. The relationship between the difference 
parameters and tolerance applications is also discussed. 

3.1 General Procedure for a Single Fit 

Methods for comparing one fit to a reference are 
described in the draft B89.4.10 standard [6]. The procedure
for all geometry types follows a common pattern.

First, the test fit is bounded by projecting the data
points onto the geometry. (The way this is done depends
on the geometry type, and will be described below.
Also, bounding does not apply to circles, spheres, or
tori, which are naturally bounded.) This is done because
tolerancing standards (e.g., Ref. [14]) specify that tolerances
are to be evaluated over the full extent of the
associated geometric features. ATEP-CMS assumes
that the data points represent the features.

^3 A small disclaimer is in order here. The use of the term "reference
fit" is not meant to imply that the ATEP-CMS fitting algorithms are
"standard algorithms" for geometric fitting problems. For instance,
the algorithm used in ATEP-CMS for nonlinear orthogonal distance
regression (see Sec. 5.1.3) was chosen for its efficient use of memory,
but its use of normal equations in the iteration is a known weakness.
Algorithms with superior numerical properties are widely available
(see, e.g., Refs. [12, 8]) and may be used by ATEP-CMS in the future.

Once the test fit is bounded, geometric differences 
between the test fit and the reference fit are determined. 
Again, the specifics depend on the geometry type. In 
general, however, the differences are designed either to 
directly reflect a tolerancing application or to provide 
diagnostic information about the software under test. 
The difference parameters depend only on the geometry 
represented by the fit parameters, and not on the 
parameterization of that geometry. 

Finally we note that in many cases the difference 
parameters are not symmetric. This is a result of the 
asymmetric treatment of the test and reference fit (the 
test fit is bounded; the reference fit is not). Thus, revers- 
ing the roles of test and reference fit will generally 
change the difference parameter values. 

3.2 Application to Specific Geometry Types 

The remainder of this section discusses the applica- 
tion of the general procedure to the seven geometry 
types currently supported by ATEP-CMS: lines, circles, 
planes, spheres, cylinders, cones, and tori. For each 
geometry type, we describe the fitting objective, typical 
uses for such fits in inspection, and the difference 
parameters computed within ATEP-CMS. 

3.2.1 Lines Line fitting minimizes the sum of
squares of orthogonal distances from the points to the fit 
line. Line fitting is commonly used to check straightness 
and to establish an axis from the centers of circular cross 
sections. In the latter case, the axis may be checked for 
location or orientation or used as a coordinate datum 
axis. 

Within ATEP-CMS, two parameters representing differences
between the lines are computed. (Henceforth,
we will call such parameters difference parameters.)
One difference parameter is the angle between the test 
and reference lines. This directly relates to the use of the 
fit as a datum axis. To define the other parameter, the 
test fit line is first trimmed to a segment by the orthog- 
onal projection of the data points. Then the second 
parameter is the maximum orthogonal distance from the 
segment endpoints to the reference line. This is a 
measure of separation between the lines and directly 
relates to the assessment of straightness, position, 
and orientation. Both parameters are inherently non- 
negative quantities. (Henceforth, quantities that are in- 
herently nonnegative will be referred to as magnitudes. ) 
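
The two line difference parameters described above can be sketched as follows. This is a minimal illustration, not the ATEP-CMS code: the function name and the (point, direction) representation of a line are assumptions made for the example, and the angle is taken between undirected lines, so it lies in [0, pi/2].

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def _unit(a):
    n = _norm(a)
    return tuple(x / n for x in a)

def line_difference_parameters(points, test_point, test_dir, ref_point, ref_dir):
    """Return (angle in radians, maximum endpoint separation) for two 3D lines."""
    t = _unit(test_dir)
    r = _unit(ref_dir)
    # Angle between the lines; atan2 stays accurate for very small angles.
    angle = math.atan2(_norm(_cross(t, r)), abs(_dot(t, r)))
    # Trim the test line to a segment: project the data points orthogonally
    # onto it and keep the two extreme projections.
    params = [_dot(_sub(p, test_point), t) for p in points]
    endpoints = [tuple(c + s * d for c, d in zip(test_point, t))
                 for s in (min(params), max(params))]
    # Orthogonal distance from each segment endpoint to the reference line.
    separation = max(_norm(_cross(_sub(e, ref_point), r)) for e in endpoints)
    return angle, separation
```

For example, a test line along the x axis, trimmed by points spanning x = 0 to x = 10, compared with a reference line through the origin tilted by 0.001 rad, gives an angle of 0.001 rad and a separation of 10 sin(0.001).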
3.2.2 Circles Circle fitting is a two-step process.
First, the plane of a circle is found by fitting a plane to 
the data (as described below). Then, the points are pro- 
jected into the plane and an orthogonal distance regres- 
sion circle is fit to the projected points. Note that this 
process generally results in a different circle than would 
be found by using the three-dimensional distances from 
the points to the circle. 

Circle fitting is commonly used to check roundness, 
runout, cross-sectional size, the straightness of the me- 
dian line of a hole or shaft, and sometimes position. It is 
also sometimes used as one step in finding an axis. The 
purpose of the two-step fitting process is that, in prac- 
tice, the deviations of the points from a plane usually 
result from measurement error only, while the deviations 
from a circle in the plane also include the effects of form 
deviation of the part. 

Within ATEP-CMS, three difference parameters are 
computed. One is the angle between the planes of the 
test and reference circles. This is primarily a diagnostic 
measure^4 that can explain other differences between the
fits. The second parameter is the Euclidean distance 
between the circle centers, measured in three dimen- 
sions. This relates directly to assessment of roundness, 
position, and establishing an axis. The third parameter is 
the difference between the test and reference circle radii. 
This relates directly to the assessment of size. The first 
two parameters are magnitudes, while the last parameter 
is a signed quantity, and is negative whenever the test fit 
radius is smaller than the reference fit radius. 

3.2.3 Planes Plane fitting minimizes the sum of 
squares of orthogonal distances from the points to the fit 
plane. Plane fitting is commonly used to establish a 
coordinate datum plane and to check flatness, position, 
and orientation. 

Within ATEP-CMS, two difference parameters are 
computed. One parameter is the angle between the test 
and reference planes. This directly relates to the use of 
the fit as a datum plane. To define the second parameter, 
the data points are orthogonally projected onto the test 
fit plane. The convex hull of the projected points in the 
plane forms a planar patch. The second parameter is the 
maximum separation of the planar patch measured or- 
thogonally from the reference plane. This measure of 
separation directly relates to the assessment of flatness, 
position, and orientation. Both parameters are magni- 
tudes. 
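
The plane difference parameters can be sketched in the same spirit. The sketch below is illustrative only; the function name and the (point, normal) representation are assumptions. It relies on one observation: the distance to the reference plane is the absolute value of an affine function, so its maximum over the convex hull of the projected points is attained at a hull vertex, which is always one of the projected points. The hull itself therefore never has to be constructed.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def _unit(a):
    n = _norm(a)
    return tuple(x / n for x in a)

def plane_difference_parameters(points, test_point, test_normal,
                                ref_point, ref_normal):
    """Return (angle in radians, maximum separation of the planar patch)."""
    nt = _unit(test_normal)
    nr = _unit(ref_normal)
    # Angle between the planes, insensitive to the sign of either normal.
    angle = math.atan2(_norm(_cross(nt, nr)), abs(_dot(nt, nr)))
    # Orthogonally project each data point onto the test plane.
    projected = []
    for p in points:
        height = _dot(_sub(p, test_point), nt)
        projected.append(tuple(c - height * n for c, n in zip(p, nt)))
    # Maximum orthogonal distance from the projected points (hence from
    # their convex hull) to the reference plane.
    separation = max(abs(_dot(_sub(q, ref_point), nr)) for q in projected)
    return angle, separation
```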

3.2.4 Spheres Sphere fitting minimizes the sum
of squares of orthogonal distances from the points to the
fit sphere. Sphere fitting is commonly used to check
form, size, and location, and occasionally to locate the
origin of a coordinate system.

^4 For instance, it can be used to detect if something other than the
two-step fitting process was used by the software under test.

Within ATEP-CMS, two difference parameters are 
computed. The first parameter is the Euclidean distance 
between the sphere centers. This relates directly to as- 
sessment of form, position, and establishing an origin. 
The second parameter is the difference between the test 
and reference sphere radii. This relates directly to the 
assessment of size. The first parameter is a magnitude, 
while the second is a signed quantity, and is negative 
whenever the test fit radius is smaller than the reference 
fit radius. 
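
As a small illustration of the sign convention, the sphere difference parameters could be computed as below. The function name and argument layout are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math

def sphere_difference_parameters(test_center, test_radius,
                                 ref_center, ref_radius):
    """Return (center distance, signed radius difference) for two spheres."""
    # Magnitude: Euclidean distance between the sphere centers.
    center_distance = math.dist(test_center, ref_center)
    # Signed: negative when the test fit radius is smaller than the reference.
    radius_difference = test_radius - ref_radius
    return center_distance, radius_difference
```

The first value is a magnitude and is never negative; the second changes sign if the roles of test and reference fit are exchanged.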

3.2.5 Cylinders Cylinder fitting minimizes the 
sum of squares of orthogonal distances from the points 
to the fit cylinder. Cylinder fitting is commonly used to 
check cylindricity and size, and to establish an axis. In 
the latter case, the axis is used to check position and 
orientation and to establish a coordinate datum. 

Within ATEP-CMS, three difference parameters are 
computed. One parameter is the angle between the test 
and reference cylinder axes. This directly relates to the 
use of the fit as a datum axis. To define the second 
parameter, the test fit axis is first trimmed to a segment 
by the perpendicular projection of the data points. Then 
the second parameter is the maximum perpendicular 
distance from the segment endpoints to the reference 
cylinder axis. This is a measure of separation between 
the axes and directly relates to the assessment of cylin- 
dricity, position, and orientation. The third parameter is 
the difference between the test and reference cylinder 
radii. This relates directly to the assessment of size. The 
first two parameters are magnitudes, while the third is a 
signed quantity, and is negative whenever the test fit 
radius is smaller than the reference fit radius. 

3.2.6 Cones Cone fitting minimizes the sum of 
squares of orthogonal distances from the points to the 
cone surface. Cone fitting is commonly used to check 
profile and conical taper. 

Within ATEP-CMS, four difference parameters are 
computed. One is the angle between the test and refer- 
ence cone axes. This directly relates to the use of the fit 
for checking profile. To define the next parameter, the
test fit axis is trimmed to a segment by first projecting
the data points perpendicularly onto the cone surface
and then orthogonally projecting those projections onto
the cone axis. Then the second parameter is the maximum
perpendicular distance from the segment endpoints
to the reference cone axis. This is a measure of
separation between the axes and relates to the assess- 
ment of profile. To define the third parameter, the refer- 
ence cone axis is bounded using the same process as was 
used for the test fit. For each fit, the perpendicular 
distance from the midpoint of the axis segment to the 
corresponding cone surface is a measure of location of 
the cone along its axis. The third parameter is the difference
between the location of the test fit cone and the
location of the reference fit cone along their respective 
axes. This is primarily diagnostic, and relates to the 
assessment of profile. The last (fourth) parameter is the 
difference between the test and reference cone half-an- 
gles. This relates directly to the assessment of conical 
taper. The first two parameters are magnitudes, while 
the last two are signed quantities, and are negative 
whenever the test fit cone parameter is smaller than that 
of the reference fit. 

3.2.7 Tori Torus fitting minimizes the sum of
squares of perpendicular distances from the points to the 
torus surface. Torus fitting is commonly used to check 
major and minor radii and profile. Occasionally torus 
fitting is used to establish a datum axis or plane. 

Within ATEP-CMS, four difference parameters are 
computed. One is the angle between the test and refer- 
ence torus planes. This directly relates to the use of the 
fit for checking profile and as a datum plane or axis. The 
second is the Euclidean distance between the torus cen- 
ters. This also relates to the assessment of profile. The 
third and fourth parameters are the differences between
the test and reference fit major and minor radii, respectively. These
relate directly to the assessment of major and minor 
radius tolerances. The first two parameters are magni- 
tudes, while the last two are signed quantities, and are 
negative whenever the test fit torus parameter is smaller 
than that of the reference fit. 

4. Data Interpretation 

The previous section described how each data set 
generates a set of difference parameters representing the 
differences between the test fit to the data set and the 
corresponding reference fit. In this section we discuss 
how these difference parameters are summarized into 
performance measures. For each difference parameter, 
we will define an associated performance measure sum- 
marizing the difference parameter values for all the data 
sets. 

We consider the set of difference parameter values 
associated with each data set to be one (random) sample 
of software performance from a theoretical underlying 
population of inspection problems. We will use a statis- 
tical approach to the interpretation of the observed sam- 
ple values. In Sec. 4.1 we derive the population charac- 
teristic we wish to use as a performance measure. In 
Sec. 4.2 we derive a practical estimator for that measure. 

4.1 Statistical Model of Algorithm Behavior 

We deal with each geometry type separately, and 
consider each performance measure (and associated 
difference parameters) independently. For simplicity,
consider a single performance parameter for a given
geometry type, as described in Sec. 3. Each data set 
represents an inspection problem drawn at random from 
a theoretical underlying population of inspection prob- 
lems. We call the substitute geometry defined by the 
mathematical objective function the true fit to the data 
set. We assume the true fit cannot be computed exactly. 
For the $i$th data set, we now define three difference
parameter values:

$t_i$: the difference between the test fit and the true
fit to the data set; $t_i$ is unknown and unobservable.

$r_i$: the difference between the reference fit and
the true fit to the data set; $r_i$ is unobservable,
but it can be bounded using numerical analysis
theory (see Sec. 5).

$p_i$: the difference between the test fit and the
reference fit to the data set; $p_i$ is the calculated
difference parameter for the $i$th data set.

These quantities are random variables since they are
functions of the data set, which is a random sample.
Because they are random variables, we can define various
expectations for them.
For the test fit quantities $t_i$ we define the mean

\[ \mu_T = E(t_i), \]

the standard deviation

\[ \sigma_T = \sqrt{E\left((t_i - \mu_T)^2\right)}, \]

and the root-mean-square error (rmse)

\[ \gamma_T = \sqrt{E(t_i^2)}, \]

where $E(\cdot)$ denotes the average over all the data sets,
that is, the expected value taken over the theoretical
population of inspection problems. We define analogous
quantities for $r_i$: $\mu_R$, $\sigma_R$, and $\gamma_R$; and for $p_i$: $\mu_P$, $\sigma_P$, and
$\gamma_P$.

If the $t_i$ are normally distributed, then one can make
statements of coverage like

\[ \Pr\{\, |t_i - \mu_T| \le 2\sigma_T \,\} \approx 0.95. \qquad (1) \]



This approach has two shortcomings. First, two parameters,
$\mu_T$ and $\sigma_T$, are needed to summarize the test; for
simplicity, we would like a single measure of performance.
Second, it is difficult to estimate $\mu_T$ and $\sigma_T$,
because we do not know the true fit.

If $\mu_T$ is less than half of $\sigma_T$, then the rmse provides
similar coverage.^5

\[ \Pr\{\, |t_i| \le 2\gamma_T : \mu_T < \sigma_T/2 \,\} \approx 0.95. \]

If $\mu_T > \sigma_T/2$, the actual coverage is greater (the software
performance is better) than that suggested by a measure
based on $\gamma_T$.
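
The coverage statements for the rmse can be checked numerically under the normality assumption. The sketch below is only a check of the arithmetic, with hypothetical function names; it uses the identity that the rmse of a variable with mean mu and standard deviation sigma is the square root of mu^2 + sigma^2.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rmse_coverage(mu, sigma):
    """Pr{|t| <= 2*gamma} for t ~ N(mu, sigma^2), with gamma the rmse of t."""
    gamma = math.hypot(mu, sigma)  # gamma^2 = E(t^2) = mu^2 + sigma^2
    upper = (2.0 * gamma - mu) / sigma
    lower = (-2.0 * gamma - mu) / sigma
    return normal_cdf(upper) - normal_cdf(lower)

# Coverage for several ratios of mean to standard deviation.
for ratio in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(ratio, rmse_coverage(ratio, 1.0))
```

Under these assumptions the coverage stays close to 0.95 while the mean is at most half the standard deviation, and it rises above that value as the mean grows, in line with the statements above.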

The above coverage interpretations are valid for normal
distributions. We believe that for the typical distributions
that will arise, the coverage error will be on the
conservative side since the tails are likely to be smaller
than those of a normal distribution.^6

ATEP-CMS uses $\gamma_T$ as the theoretical performance
measure of the software under test. Using the rmse
overcomes one shortcoming of Eq. (1): we have
reduced the performance indicator to a single number.
We have also replaced the problem of estimating $\mu_T$ and
$\sigma_T$ with the problem of estimating $\gamma_T$. We consider this
next.

4.2 Definition of Practical Performance Measures 

In this section, we will describe two methods of estimating
$\gamma_T$. The first follows directly from the definition
of $\gamma_T$. The second is a simplified estimate. ATEP-CMS
uses the second method, for reasons discussed at the end
of this section.

We observe from the definitions that $t_i = p_i + r_i$ for
signed parameters, and $t_i \le p_i + r_i$ for magnitudes. Then
a simple substitution yields

\[
\begin{aligned}
\gamma_T^2 = E(t_i^2) &\le E(p_i^2) + E(r_i^2) + 2E(p_i r_i) \\
&\le \gamma_P^2 + \gamma_R^2 + 2\gamma_P \gamma_R .
\end{aligned}
\]

Thus, we can estimate an upper bound for $\gamma_T$ if we can
estimate $\gamma_R$ and $\gamma_P$. The upper bound can then be used
as an estimate for $\gamma_T$.

To estimate $\gamma_R$, we need an estimate of the difference
of each reference fit from the true fit. Methods for doing
this will be discussed in detail in Sec. 5. For now,
assume that the value of $r_i$ can be estimated as $u_i$. Then
an estimate^7 $\hat{\gamma}_R$ of $\gamma_R$ is the positive square root of

\[ \hat{\gamma}_R^2 = \frac{1}{n} \sum_{i=1}^{n} u_i^2, \]

where the sum is over all $n$ data sets. (In this paper, all
summations are from $i = 1$ to $n$ unless otherwise indicated.
Henceforth, we will use the summation sign alone
with the index variable and limits understood.) Similarly,
we estimate $\gamma_P$ by the positive square root of

\[ \hat{\gamma}_P^2 = \frac{1}{n} \sum p_i^2 . \]

^5 See, e.g., Ref. [15, Chap. 2].

^6 This assumption will be tested using data collected during future
operation of ATEP-CMS.

^7 Throughout this paper, a caret appearing over a quantity denotes an
estimated value.

This first estimation method includes a term in the 
performance measure that represents the estimated dif- 
ference between the reference and true fits. 

A second method of estimating $\gamma_T$ is simply $\hat{\gamma}_T = \hat{\gamma}_P$,
estimating $\gamma_P$ as before. This method uses the reference
fit as an estimate of the true fit. Unlike the first ap- 
proach, the differences between the reference and true 
fits are not incorporated in the performance measure of 
the software under test. However, they will contribute to 
the uncertainty of the performance measures (discussed 
in Sec. 5). 

An argument in support of the first method of estima- 
tion is that the second, simpler method may underesti- 
mate the true quantity. If the performance measures are 
to be used to establish uncertainty budgets for a CMS, 
use of the second method makes it critical to analyze the 
sensitivity of the budget to the uncertainties of the 
ATEP-CMS results. The first method may have smaller 
uncertainties. 

On the other hand, there are three reasons why the 
simpler estimate is preferable. First, the software under 
test should not be penalized simply because the refer- 
ence software has uncertain performance. Second, the 
first method may overemphasize the role of the refer- 
ence software, and may be unduly pessimistic (since it is 
an upper bound). Finally, a sensitivity study is a proper 
precaution, and should be recommended anyway. 

For these reasons, ATEP-CMS uses the second
method (the observed sample rmse) to estimate the
performance of the software under test. That is, for each
performance measure in ATEP-CMS,

\[ \hat{\gamma}_T = \hat{\gamma}_P = \sqrt{\frac{1}{n} \sum p_i^2}. \]
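
The two estimation methods of this section can be sketched in a few lines. The function names and the sample values are hypothetical. The sketch uses the fact that the right-hand side of the upper bound is a perfect square, so the first method reduces to the sum of the two sample rmse values.

```python
import math

def rmse(values):
    """Sample root-mean-square: positive square root of the mean square."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def performance_estimates(p, u):
    """Return both estimates of the performance measure.

    p: observed difference parameters (test fit versus reference fit)
    u: estimated differences between the reference fits and the true fits
    """
    gamma_p = rmse(p)
    gamma_r = rmse(u)
    return {
        "upper_bound": gamma_p + gamma_r,  # first method
        "sample_rmse": gamma_p,            # second method, used by ATEP-CMS
    }

# Hypothetical difference parameter values for four data sets.
p = [3.0e-6, -4.0e-6, 1.0e-6, 2.0e-6]
u = [1.0e-9, 2.0e-9, -1.0e-9, 1.0e-9]
print(performance_estimates(p, u))
```

When the estimated reference fit differences are much smaller than the observed differences, as in this made-up example, the two estimates are nearly equal.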



5. Uncertainty Analysis 

This section addresses the uncertainty of the ATEP- 
CMS estimates of software performance. Recall that we 
have defined different performance measures for each 
geometry type, and that each performance measure is 
estimated from the corresponding set of difference 
parameters for the fits. We will show in this section that 
the uncertainty of each estimate arises mainly from two 
sources: (1) the uncertainty of the individual difference
parameters due to the uncertainty of the reference fit 
and (2) the uncertainty arising from having sampled the 
population of inspection problems. 

In Sec. 5.1 we will first examine the uncertainties of 
the reference fits themselves. In Sec. 5.2 we then exam- 
ine how these uncertainties propagate to the difference 
parameters. Finally, in Sec. 5.3 we discuss how the 
difference parameter uncertainties combine with the 
sampling uncertainty to yield an overall uncertainty for 
each performance measure. 



hold for the y and z coordinates. 

To conform to NIST policy on uncertainty statements, 
this must be converted to an uncertainty by assuming 
some distribution on the possible values of 5,. We will 
assume that the 8i are independent and uniformly 
distributed within the bounds [18]. Then the /th error 
term Xi8i is a random variable with mean of zero and 
standard deviation 2lx/le8o/V3. So, by the Central 
Limit Theorem [19], the x coordinate of the computed 
centroid of m data points is approximately normally 
distributed with mean Xq and standard deviation of 



5,1 Uncertainty of Reference Fits 

The uncertainties of the difference parameters for a 
single test fit arise primarily from the uncertainty due to 
the presence of (unknown) numerical errors in the refer- 
ence fit.^ That is, the reference fit may not correspond 
exactly to the true fit. A detailed discussion of the NIST 
fitting algorithms and their uncertainties will be pre- 
sented in a subsequent report. Here, we provide a sum- 
mary of the pertinent results. 

Our approach is to develop bounds for the numerical 
errors, assume the errors are uniformly distributed be- 
tween the bounds, and combine the standard deviations 
of the errors into the uncertainty of the reference fits. 
The results presented in this section and the next are 
specific to the particular algorithms currently used in 
ATEP-CMS. It is possible that the algorithms may 
change in the future [16], in which case these results 
would no longer hold. (The performance measures 
would not change, however.) 

5.1.1 Centroid Coordinates For all fits, the data 
are first translated to be centered at the origin. One 
source of uncertainty is the possible rounding error in 
computing the centroid. The ATS algorithms use the 
Kahan summation algorithm [17] with a C-language 
long double accumulator (80 bit EEEE extended real 
format). The error in the coordinates of the computed 
centroid can be bounded in terms of the unit roundoff 
^80 for the accumulator (about 1.084X10'^^). For in- 
stance, if Xc is the computed x coordinate of the centroid 
(the average of the coordinates X/ , / = 1 , . . . , m ) and Xo 
is the true coordinate, then the following result holds: 

$$
\begin{aligned}
x_c &= \frac{1}{m}\sum_{i=1}^{m} x_i\,(1 + \delta_i) + O(10^{-38}) \sum_{i=1}^{m} |x_i| \\
    &= x_0 + \frac{1}{m}\sum_{i=1}^{m} x_i\,\delta_i ,
\end{aligned}
$$

where $|\delta_i| \le 2\varepsilon_{80} \approx 2.17 \times 10^{-19}$ and in the second line
we have ignored the term in $O(10^{-38})$. If the $\delta_i$ are treated as
independent and uniformly distributed within this bound, each has a
standard deviation of $2\varepsilon_{80}/\sqrt{3} \approx 1.25 \times 10^{-19}$, so the
error in $x_c$ has a standard deviation of about

$$
\frac{1.25 \times 10^{-19}}{m} \sqrt{\sum_{i=1}^{m} x_i^2} .
$$

Similar results hold for the y and z coordinates. For
simplicity, the uncertainty of the centroid is modeled in
ATEP-CMS by an isotropic, trivariate normal distribution
using a standard deviation of

$$
u_0 = 1.25 \times 10^{-19}\, x_{\max}/m .
$$

5.1.2 Linear Fits (Lines and Planes) For lines
and planes, orthogonal distance regression can be
formulated as a linear algebra problem. Within ATEP-CMS,
the reference fits are computed using the singular
value decomposition of the data matrix after translating
the centroid to the origin. The properties of the singular
value decomposition are well understood, and the
magnitude of the error in the direction vector can be bounded
using standard techniques [20] as follows.

We consider first the case of line fits. The direction of
the line is the right singular vector corresponding to the
largest singular value of the data matrix. The singular
value decomposition is computed using C-language
double (64 bit IEEE real format) arithmetic, for which
the unit roundoff is $\varepsilon_{64} \approx 2.22 \times 10^{-16}$. Call the data
matrix $D$ and denote the exact solution by $Q^*$. It can be
shown that the computed solution, $Q$, is (within $\varepsilon_{64}$) the
exact solution to a matrix $D + E$ where $\|E\|_2 \le \varepsilon_{64} \|D\|_2$.
(Here, $\|\cdot\|_2$ denotes the vector or matrix 2-norm.) From
this, it can be shown that [20], as long as the largest
singular value is well separated^ from the next largest
singular value in comparison to $\|E\|_2$, the error in the
computed solution is bounded by

$$
\|Q - Q^*\|_2 \le \frac{4\,\varepsilon_{64}\,\lambda_1^2}{|\lambda_1^2 - \lambda_2^2|} \qquad \text{(for lines)},
$$

where $\lambda_1$ is the largest singular value and $\lambda_2$ is the
next-largest singular value. (This bound is not the tightest
possible. For instance, as $\lambda_2 \to \lambda_1$, the bound can far
exceed the theoretical worst-case difference between $Q$
and $Q^*$ of 2. In such situations, however, the singular
values are not well separated. Within ATEP-CMS, the
error bound is never allowed to exceed 2.)

^ Other errors, such as floating-point roundoff during evaluation of the
difference parameters, are considered small enough to be ignored.

^ The notion of "well separated" can be made precise. Essentially, it
means that the data have one dominant direction of scatter (so a line fit
makes sense).
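As an illustration of the capped bound for line fits, the following is a minimal sketch, not the ATEP-CMS implementation. It assumes the bound has the form $4\varepsilon_{64}\lambda_1^2 / |\lambda_1^2 - \lambda_2^2|$ and that the singular values are already available; the function name `line_direction_bound` and its arguments are hypothetical.

```python
EPS64 = 2.0 ** -52  # unit roundoff for IEEE double, about 2.22e-16


def line_direction_bound(s1, s2, eps=EPS64):
    """Bound on ||Q - Q*||_2 for a line fit (assumed form, see lead-in).

    s1 is the largest singular value of the centered data matrix and
    s2 the next largest. The result is capped at 2, the largest possible
    distance between two unit vectors.
    """
    gap = abs(s1 * s1 - s2 * s2)
    if gap == 0.0:
        # Singular values not separated at all: only the cap applies.
        return 2.0
    return min(2.0, 4.0 * eps * s1 * s1 / gap)


if __name__ == "__main__":
    print(line_direction_bound(10.0, 1.0))  # well separated: tiny bound
    print(line_direction_bound(1.0, 1.0))   # degenerate: capped at 2
```

The cap matters only when the two singular values nearly coincide, which is the situation in which a line fit is poorly defined in the first place.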

Similar results can be obtained for plane fits. In this
case, however, the normal to the plane is the right singular
vector corresponding to the smallest singular value, $\lambda_3$.
The corresponding error bound is

$$
\|Q - Q^*\|_2 \le \frac{4\,\varepsilon_{64}\,\lambda_1^2}{|\lambda_2^2 - \lambda_3^2|} \qquad \text{(for planes)}.
$$

We have here developed overall bounds for the errors
in the singular vectors. To convert these to standard
uncertainties, we assume the errors are uniformly distributed
within the bounds. The resulting standard deviation,
denoted $u_d$, is given by

$$
u_d =
\begin{cases}
\dfrac{4\,\varepsilon_{64}\,\lambda_1^2}{\sqrt{3}\,|\lambda_1^2 - \lambda_2^2|} & \text{(lines)} \\[2ex]
\dfrac{4\,\varepsilon_{64}\,\lambda_1^2}{\sqrt{3}\,|\lambda_2^2 - \lambda_3^2|} & \text{(planes).}
\end{cases}
$$

These results are quite conservative. In general, any one
singular vector will be more sensitive in one direction than
another to perturbations in the data. This behavior could,
in principle, be used to develop tighter bounds on the
numerical errors propagated through the difference
computations. Such an analysis will be the subject of
future work.

5.1.3 Nonlinear Fits For geometries other than
line and plane (circle, sphere, cylinder, cone, and torus),
orthogonal distance regression is a nonlinear least-squares
optimization problem. (Circle fitting uses plane
fitting to reduce the problem to two dimensions.) The
algorithm currently used in ATEP-CMS to compute the
reference fits is a modified Levenberg-Marquardt
algorithm proposed by Nash [21].

The ATEP-CMS algorithm starts with an initial guess
and iteratively solves the normal equations for a linear
approximation to the nonlinear objective function,
where the solution is constrained to lie within a "trust
region" of the current best guess. The iterations terminate
when the computed solution to the normal equations
changes the solution by less than a convergence
factor set by the tester. (Typically, the convergence
factor used in ATEP-CMS is 10"^^)

The uncertainty of the reference fit is a combination
of two factors. First, the termination logic introduces an
uncertainty that is bounded by the convergence factor
setting. Second, the solution to the normal equations at
the terminating iteration is subject to numerical roundoff
effects.

The convergence factor introduces an uncertainty in
each element of the computed solution. If the calculated
solution is a $k$-dimensional vector $F$, if $F^*$ is the exact
solution, and if $C$ is the convergence factor, then the
resulting error due to $C$ alone is bounded by

$$
\|F - F^*\|_2 \le C\sqrt{k} .
$$



We next deal with the numerical inaccuracies. The
fits are calculated using the Choleski decomposition of
the normal equations in C-language double arithmetic,
so the error in the solution due to numerical roundoff
effects alone can be bounded by

$$
\|F - F^*\|_2 \le \varepsilon_{64}\,\kappa(A)\,\|F\|_2 ,
$$

where $\kappa(A)$ is the condition number of the normal
equation matrix $A$ used for the last iteration of the
Levenberg-Marquardt routine, and $\varepsilon_{64} \approx 2.22 \times 10^{-16}$ as
before [20].

To obtain an overall uncertainty, we must convert
these bounds into standard deviations. We assume that
the numerical errors and the convergence factor errors
are independent and uniformly distributed between the
bounds. Denote by $u_F$ the standard uncertainty for
nonlinear fits. Then

$$
u_F = \sqrt{\frac{k\,C^2 + \varepsilon_{64}^2\,\kappa^2(A)\,\|F\|_2^2}{3}} .
$$
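The combination of the two bounds can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the ATEP-CMS code: each bound is divided by $\sqrt{3}$ (uniform distribution) and the two results are combined in quadrature. The function name and argument names are hypothetical, and the condition number and parameter norm are assumed to be supplied by the fitting routine.

```python
import math

EPS64 = 2.0 ** -52  # unit roundoff for IEEE double, about 2.22e-16


def nonlinear_fit_uncertainty(conv_factor, k, cond, f_norm, eps=EPS64):
    """Standard uncertainty u_F of a nonlinear reference fit.

    conv_factor: convergence factor C set by the tester
    k:           number of fit parameters (dimension of F)
    cond:        condition number of the normal-equation matrix A
    f_norm:      2-norm of the computed parameter vector F
    """
    conv_bound = conv_factor * math.sqrt(k)   # bound due to termination
    round_bound = eps * cond * f_norm         # bound due to roundoff
    # Uniform distribution on [-b, b] has variance b**2 / 3.
    return math.sqrt((conv_bound ** 2 + round_bound ** 2) / 3.0)


if __name__ == "__main__":
    # Example values chosen only for illustration.
    print(nonlinear_fit_uncertainty(1e-12, 7, 1e6, 50.0))
```

With a well-conditioned problem the convergence term dominates; with a badly conditioned one the roundoff term takes over.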



As with linear fits, the bounds developed here bracket 
the overall behavior of the fit parameter vector. But as 
with the singular value decomposition, the solution to 
the nonlinear problem will in general exhibit greater 
sensitivity and correlations for some parameters than for 
others. Thus, it may be possible to obtain tighter uncer- 
tainty bounds than those developed here. One problem is 
that, for geometries other than the sphere, the parame- 
terization used for fitting must somehow represent the 
orientation of the element. Unfortunately, there is no 
parameterization of orientations that is free of singular- 
ities, and there is always the possibility that the Jacobian 
of the residuals is rank deficient. This makes traditional 
approaches using, for instance, the condition number of 
the asymptotic covariance matrix at the computed solu- 
tion problematic [22]. As with the singular value de- 
composition, future work will address tightening the 
bounds on the fit uncertainties. 
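Returning briefly to the centroid computation of Sec. 5.1.1, compensated (Kahan) summation can be sketched as follows. This is an illustrative sketch only: it uses Python floats (IEEE double) rather than the 80 bit long double accumulator described in the text, and the function names `kahan_mean` and `center_points` are hypothetical.

```python
def kahan_mean(values):
    """Mean of a sequence using Kahan compensated summation."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - comp            # apply the correction from the last step
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost
        total = t
    return total / len(values)


def center_points(points):
    """Translate 3-D points so that their centroid is at the origin.

    Returns the centroid and the list of translated points.
    """
    centroid = tuple(kahan_mean([p[k] for p in points]) for k in range(3))
    shifted = [tuple(p[k] - centroid[k] for k in range(3)) for p in points]
    return centroid, shifted


if __name__ == "__main__":
    c, pts = center_points([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
    print(c, pts)
```

The compensation term keeps the accumulated error nearly independent of the number of points, which is why the bound in Sec. 5.1.1 depends on the unit roundoff rather than on the product of the roundoff and the point count.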
5.2 Propagation to Difference Parameters

This section discusses how the uncertainties of the 
computed reference fits propagate to the difference 
parameters for individual fits. As with Sec. 5.1, the 
results of this section are specific to the particular 
algorithms currently being used within ATEP-CMS. 

The previous section identified several sources of 
standard uncertainty for the reference fits. We have 
designated these as follows: 

$u_0$: the standard deviation of the centroid coordinates

$u_d$: the standard deviation of the error in direction for
line and plane fits

$u_F$: the standard deviation of the error norm of the
representation parameter vector for nonlinear fits.

We will separately propagate the standard uncertain- 
ties represented by each of these standard deviations 
through the comparison algorithms and combine the 
resulting standard uncertainties. 

The calculations are fairly straightforward, but 
tedious. Therefore, we will show details only for the 
case of the uncertainties of line fits. For other geometry 
types, we will just present the final results, since the 
method is the same. 

5.2.1 Line Line differences are defined by two
characteristics: the angle between the test and reference
lines, and the distance of an endpoint of the test line
segment to the reference line. For the angle, we proceed
as follows. As in Sec. 5.1.2, let $Q$ be the direction of the
computed reference line and $Q^*$ the direction of the true
reference line. Also, let $Q_T$ be the direction vector of the
test fit, $\alpha$ be the angle between $Q$ and $Q_T$, and $\alpha^*$ be the
angle between $Q_T$ and $Q^*$. Since all the direction vectors
are unit vectors, the sine of the angle between any two
of the vectors is the magnitude of their vector
cross-product. We then have:

$$
\begin{aligned}
\sin(\alpha) &= \|Q_T \times Q\|_2 \\
             &= \|Q_T \times (Q^* + (Q - Q^*))\|_2 \\
             &= \|Q_T \times Q^* + Q_T \times (Q - Q^*)\|_2 \\
             &\le \|Q_T \times Q^*\|_2 + \|Q_T \times (Q - Q^*)\|_2 \\
             &\le \sin(\alpha^*) + \|Q - Q^*\|_2 .
\end{aligned}
$$

If we start with $\sin(\alpha^*) = \|Q_T \times Q^*\|_2$ and follow a
similar sequence, we find that $\sin(\alpha^*) \le \sin(\alpha) + \|Q - Q^*\|_2$.
Assuming that $\alpha$ and $\alpha^*$ are small (i.e., that the
test and reference results are close), $\sin(\alpha) \approx \alpha$ and
$\sin(\alpha^*) \approx \alpha^*$. Thus, $|\alpha - \alpha^*| \le \|Q - Q^*\|_2$. Using the
results of Sec. 5.1.2, $\|Q - Q^*\|_2$ is the random variable
corresponding to a standard deviation of $u_d$. We then
have

$$
u_\alpha = u_d .
$$

We now consider the distance from any point $q$ to the
reference line. Call this distance $d$. The reference line
is located by the centroid of the data, $q_0$, so
$d = \|(q - q_0) \times Q\|_2$. Two components of the fit
uncertainty affect the uncertainty of $d$: the uncertainty of $Q$
and the uncertainty of $q_0$. We treat these as independent
sources of uncertainty and combine them in quadrature.
With regard to $Q$, we proceed as we did with $\alpha$ and find
that the uncertainty in $d$ due to the uncertainty in $Q$ is
bounded by

$$
\|q - q_0\|_2 \, \|Q - Q^*\|_2 .
$$

The standard uncertainty in $d$ due to the standard
uncertainty in $q_0$ is $u_0$. Thus, the standard uncertainty of the
separation parameter for lines is

$$
u_{\mathrm{sep}} = \sqrt{u_0^2 + \|q - q_0\|_2^2 \, u_d^2} ,
$$

where $q$ is now the endpoint on the test line segment
furthest from the reference line.
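The angle inequality and the separation formula for lines can be checked numerically with a short sketch. The vectors below are made-up illustrative values, and the helper names (`cross`, `norm`, `unit`, `line_separation_uncertainty`) are hypothetical rather than part of ATEP-CMS.

```python
import math


def cross(a, b):
    """Vector cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Euclidean 2-norm of a vector."""
    return math.sqrt(sum(x * x for x in a))


def unit(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return tuple(x / n for x in a)


def line_separation_uncertainty(q, q0, u0, ud):
    """sqrt(u0^2 + ||q - q0||^2 * ud^2), combining centroid and direction terms."""
    lever = norm(tuple(a - b for a, b in zip(q, q0)))
    return math.sqrt(u0 ** 2 + (lever * ud) ** 2)


# Illustrative directions: true reference, computed reference, test fit.
q_true = unit((1.0, 0.0, 0.0))
q_comp = unit((1.0, 1e-3, 0.0))
q_test = unit((1.0, 0.0, 2e-3))

direction_error = norm(tuple(a - b for a, b in zip(q_comp, q_true)))
sine_gap = abs(norm(cross(q_test, q_comp)) - norm(cross(q_test, q_true)))

if __name__ == "__main__":
    # The change in the sine of the angle never exceeds ||Q - Q*||_2.
    print(sine_gap, "<=", direction_error)
    print(line_separation_uncertainty((3.0, 4.0, 0.0), (0.0, 0.0, 0.0), 0.0, 1e-16))
```

The lever arm $\|q - q_0\|_2$ is what makes a small direction error matter: the further the endpoint lies from the centroid, the larger the contribution of $u_d$ to the separation uncertainty.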

5.2.2 Circle The standard uncertainties in the
difference parameters for circles are as follows:

• distance between centers: $u_c = \sqrt{u_0^2 + u_F^2}$

• angle between planes (due to the plane fits): $u_\alpha = u_d$

• difference in radii: $u_r = u_F$.

5.2.3 Plane The standard uncertainties in the
difference parameters for planes are as follows:

• angle between planes: $u_\alpha = u_d$

• plane separation: $u_{\mathrm{sep}} = \sqrt{u_0^2 + \|q - q_0\|_2^2 \, u_d^2}$, where
$q$ is the point on the test plane furthest from the
reference plane and $q_0$ is the centroid of the data.

5.2.4 Sphere The standard uncertainties in the
difference parameters for spheres are as follows:

• distance between centers: $u_c = \sqrt{u_0^2 + u_F^2}$

• difference in radii: $u_r = u_F$.

5.2.5 Cylinder The standard uncertainties in the
difference parameters for cylinders are as follows:

• angle between axes: $u_\alpha = u_F$

• axis separation: $u_a = \sqrt{u_0^2 + \|q - q_r\|_2^2 \, u_F^2}$, where $q$ is
the endpoint of the test fit axis furthest from the
reference fit axis and $q_r$ is the point used to locate the
reference fit axis

• difference in radii: $u_r = u_F$

5.2.6 Cone The standard uncertainties in the
difference parameters for cones are as follows:

• angle between axes: $u_\alpha = u_F$

• axis separation: $u_a = \sqrt{u_0^2 + \|q - q_r\|_2^2 \, u_F^2}$, where $q$ is
the endpoint of the test fit axis furthest from the
reference fit axis and $q_r$ is the point used to locate the
reference fit axis

• difference in axial location: $u_l = \sqrt{u_0^2 + u_F^2}$

• difference in half-angle: $u_h = u_F$



5.2.7 Torus The standard uncertainties in the
difference parameters for tori are as follows:

• angle between planes: $u_\alpha = u_F$

• center separation: $u_c = \sqrt{u_0^2 + u_F^2}$

• difference in major radii: $u_M = u_F$

• difference in minor radii: $u_m = u_F$
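The per-geometry results of Secs. 5.2.1 through 5.2.7 can be collected in one lookup, sketched below. This is an illustrative summary under stated assumptions, not the ATEP-CMS implementation: the function name, the dictionary keys, and the `lever` argument (standing for the distance $\|q - q_0\|_2$ or $\|q - q_r\|_2$ used in the separation formulas) are all hypothetical.

```python
import math


def difference_uncertainties(geometry, u0, ud=0.0, uF=0.0, lever=0.0):
    """Standard uncertainties of the difference parameters for one fit.

    u0: centroid uncertainty, ud: direction uncertainty (lines, planes),
    uF: parameter-vector uncertainty (nonlinear fits),
    lever: distance from the locating point to the furthest test point.
    """
    center = math.hypot(u0, uF)            # sqrt(u0^2 + uF^2)
    linear_sep = math.hypot(u0, lever * ud)
    axis_sep = math.hypot(u0, lever * uF)
    table = {
        "line": {"angle": ud, "separation": linear_sep},
        "plane": {"angle": ud, "separation": linear_sep},
        "circle": {"center distance": center, "plane angle": ud, "radius": uF},
        "sphere": {"center distance": center, "radius": uF},
        "cylinder": {"axis angle": uF, "axis separation": axis_sep, "radius": uF},
        "cone": {"axis angle": uF, "axis separation": axis_sep,
                 "axial location": center, "half-angle": uF},
        "torus": {"plane angle": uF, "center separation": center,
                  "major radius": uF, "minor radius": uF},
    }
    return table[geometry]


if __name__ == "__main__":
    print(difference_uncertainties("sphere", 3.0, uF=4.0))
```

Grouping the formulas this way makes the pattern visible: location-type parameters combine the centroid term with a fit term in quadrature, while size and angle parameters inherit the fit uncertainty directly.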

5.3 Uncertainty of the Estimated Performance 
Parameters 

The results of Sec. 5.2 provide us with a means of 
estimating the uncertainties, expressed as standard 
deviations, of individual difference parameters pi for 
individual data sets. Using this, we can now estimate the 
uncertainty of the software performance measures for 
each geometry type. 

Recall (from Sec. 4.2) that for a particular performance
characteristic, we are estimating the performance
measure by the observed root-mean-square value of the
difference parameters:

$$
\hat{\gamma}_p = \sqrt{\frac{1}{n} \sum_{i=1}^{n} p_i^2} .
$$

Furthermore, we have an estimate, denoted by $u_i$, for the
standard deviation of each $p_i$ attributable to the
uncertainty of the reference fit.

Our objective is to estimate the variance of $\hat{\gamma}_p$. To that
end, we will use the following identity:

$$
V(\hat{\gamma}_p) = E_A[V(\hat{\gamma}_p \mid A)] + V_A[E(\hat{\gamma}_p \mid A)] ,
$$

where $A$ is a random event, $E(\hat{\gamma}_p \mid A)$ and $V(\hat{\gamma}_p \mid A)$ are,
respectively, the expected value and variance of $\hat{\gamma}_p$
conditional on event $A$, and $E_A[\ ]$ and $V_A[\ ]$ are,
respectively, the expected value and variance over all events $A$.
This identity, which follows from the theorem of total
probability, helps us estimate the variance of $\hat{\gamma}_p$ by using
the observed sample of performance for the data sets we
used.

Let $A$ be the event that the sample of observations is
the one we indeed observed. (By conditioning on $A$ we
are treating the sample as fixed and not random.) We
can interpret the two terms of the variance as follows.
The first term, $E_A[V(\hat{\gamma}_p \mid A)]$, represents the component
of standard uncertainty due to the uncertainty of the
reference fits. We will denote this quantity by $u_r^2$,
meaning "square of the standard uncertainty due to the
reference." The second term, $V_A[E(\hat{\gamma}_p \mid A)]$, represents
the component of standard uncertainty due to the
variation in the sampling. We will denote this by $u_s^2$, meaning
"square of the standard uncertainty due to sampling."
We now develop estimates for the two components of
uncertainty, starting with $u_r$.

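5.3.1 Uncertainty Due to the Reference To
estimate $u_r$, we must first estimate $V(\hat{\gamma}_p \mid A)$. Considering $\hat{\gamma}_p$
as a function of the $p_i$, we have, using the law of
propagation of uncertainty,

$$
V(\hat{\gamma}_p \mid A) \approx \sum_{i=1}^{n} \left( \frac{\partial \hat{\gamma}_p}{\partial p_i} \right)^2 u_i^2
 = \frac{\sum_{i} p_i^2 \, u_i^2}{n \sum_{i} p_i^2} .
$$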

If all the $p_i$ are zero (as might happen, for example, if
the NIST algorithms are tested against themselves), then
this estimate is indeterminate. However, whenever $p_i$ is
small in relation to $u_i$, we can improve the estimate by
evaluating the partial derivatives in the propagation
formula at the expected value of $|p_i|$ instead of at the
expected value of $p_i$. To do this, we assume that $p_i$ is
distributed uniformly about the observed value within
$\pm\sqrt{3}\,u_i$. Then, whenever $|p_i| < \sqrt{3}\,u_i$ the mean should
be modified to include the "folding back" at zero of the
distribution of $|p_i|$. The result is

$$
V(\hat{\gamma}_p \mid A) \approx \frac{\sum_{i} \bar{p}_i^2 \, u_i^2}{n \sum_{i} \bar{p}_i^2} ,
$$

where

$$
\bar{p}_i =
\begin{cases}
p_i & \text{if } p_i^2 \ge 3u_i^2 \\[1ex]
\dfrac{p_i^2 + 3u_i^2}{2\sqrt{3}\,u_i} & \text{if } p_i^2 < 3u_i^2 .
\end{cases}
$$

This approximation to $V(\hat{\gamma}_p \mid A)$ is used in ATEP-CMS
as an estimate of $E_A[V(\hat{\gamma}_p \mid A)]$:

$$
u_r^2 = \frac{\sum_{i} \bar{p}_i^2 \, u_i^2}{n \sum_{i} \bar{p}_i^2} .
$$

(If all the $\bar{p}_i$ are zero, that is, $u_i = 0$ whenever $p_i = 0$,
then we set $u_r = 0$.)
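A minimal sketch of the reference component follows, assuming the folded-mean rule described above (the mean of $|p_i|$ for a uniform distribution of half-width $\sqrt{3}\,u_i$ centered on the observed value). The function names `folded_mean` and `reference_variance` are hypothetical.

```python
import math


def folded_mean(p, u):
    """Expected value of |p| for p uniform within +/- sqrt(3)*u of its observed value."""
    half_width = math.sqrt(3.0) * u
    if abs(p) >= half_width:
        # Distribution does not straddle zero (also covers u == 0).
        return abs(p)
    return (p * p + half_width * half_width) / (2.0 * half_width)


def reference_variance(p, u):
    """u_r^2: variance of the RMS measure due to the reference-fit uncertainties.

    p: observed difference parameters, one per data set
    u: their standard uncertainties attributable to the reference fits
    """
    pbar = [folded_mean(pi, ui) for pi, ui in zip(p, u)]
    denom = len(p) * sum(x * x for x in pbar)
    if denom == 0.0:
        return 0.0  # all folded means are zero: set u_r = 0
    return sum((x * ui) ** 2 for x, ui in zip(pbar, u)) / denom


if __name__ == "__main__":
    print(reference_variance([10.0, 10.0], [1.0, 1.0]))
    print(reference_variance([0.0, 0.0], [1.0, 1.0]))
```

Because the folded mean is strictly positive whenever $u_i > 0$, the estimate stays finite even when every observed difference is zero.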

The component of standard uncertainty due to the 
reference is estimated from numerical analysis consider- 
ations and, in the language of the guidelines on uncer- 
tainty [11, 10], is a Type B standard uncertainty. 

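5.3.2 Uncertainty Due to Sampling Variation
We now turn to $u_s^2$, the component of standard
uncertainty due to sampling. To estimate $u_s^2$, we need to
estimate $E(\hat{\gamma}_p \mid A)$. We do this by simply using a zero-order
expansion of $\hat{\gamma}_p$ about the $p_i$, giving $E(\hat{\gamma}_p \mid A) \approx \hat{\gamma}_p$,
and we are left with estimating $V_A(\hat{\gamma}_p)$. If $\hat{\gamma}_p = 0$, we
estimate $V_A(\hat{\gamma}_p) = 0$. Otherwise we proceed as follows.
By the law of propagation of uncertainty,

$$
V_A(\hat{\gamma}_p) \approx \frac{1}{4\hat{\gamma}_p^2} V_A(\hat{\gamma}_p^2)
 = \frac{1}{4 n^2 \hat{\gamma}_p^2} \sum_{i=1}^{n} V_A(p_i^2) .
$$

For convenience, we denote by $\sigma_A^2$ the mean variance of
the $p_i^2$ over all samples:

$$
\sigma_A^2 = \frac{1}{n} \sum_{i=1}^{n} V_A(p_i^2) .
$$

Since $\hat{\gamma}_p^2$ is estimated from the sample, an unbiased
estimate for $\sigma_A^2$ is:

$$
\hat{\sigma}_A^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( p_i^2 - \hat{\gamma}_p^2 \right)^2 .
$$

We then have

$$
u_s^2 = \frac{\sum_{i} \left( p_i^2 - \hat{\gamma}_p^2 \right)^2}{4 n (n-1) \hat{\gamma}_p^2}
      = \frac{\sum_{i} \left( p_i^2 - \hat{\gamma}_p^2 \right)^2}{4 (n-1) \sum_{i} p_i^2} ,
$$

with $u_s = 0$ if all of the $p_i = 0$.

Since the component of uncertainty due to sampling
is estimated from a statistical analysis of data, it is a
Type A standard uncertainty, with $n-1$ degrees of
freedom.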
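The sampling component can be sketched in a few lines. This is an illustration only; the function name `sampling_variance` is hypothetical, and returning zero for fewer than two data sets is an assumption made here to avoid dividing by $n - 1 = 0$, not something stated in the text.

```python
def sampling_variance(p):
    """u_s^2: variance of the RMS measure due to sampling variation.

    p: observed difference parameters, one per data set.
    """
    n = len(p)
    squares = [x * x for x in p]
    total = sum(squares)
    if total == 0.0 or n < 2:
        # All differences zero (u_s = 0), or too few data sets (assumption).
        return 0.0
    mean_square = total / n  # square of the RMS performance measure
    spread = sum((s - mean_square) ** 2 for s in squares)
    return spread / (4.0 * (n - 1) * total)


if __name__ == "__main__":
    print(sampling_variance([0.0, 2.0]))
    print(sampling_variance([1.0, 1.0, 1.0]))
```

When every data set gives the same difference, the squared values have no spread and the sampling component vanishes, as expected.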

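5.3.3 Combined Standard Uncertainty Putting
the two terms of the variance identity together, the
combined standard uncertainty of the estimated performance
parameter $\hat{\gamma}_p$ is the positive square root of the
estimated variance of $\hat{\gamma}_p$:

$$
V(\hat{\gamma}_p) = u_r^2 + u_s^2
 = \frac{\sum_{i} \bar{p}_i^2 \, u_i^2}{n \sum_{i} \bar{p}_i^2}
 + \frac{\sum_{i} \left( p_i^2 - \hat{\gamma}_p^2 \right)^2}{4 (n-1) \sum_{i} p_i^2} ,
$$

with $\bar{p}_i$ as defined in Sec. 5.3.1.

Following NIST convention, the expanded uncertainty
reported by ATEP-CMS is twice the combined standard
uncertainty (i.e., a coverage factor of $k = 2$).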
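The final reporting step can be sketched as follows, taking the two variance components as inputs. The function names and the example numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def rms_performance(p):
    """Root-mean-square of the difference parameters (the performance estimate)."""
    return math.sqrt(sum(x * x for x in p) / len(p))


def expanded_uncertainty(ur_sq, us_sq, coverage=2.0):
    """Coverage factor times the combined standard uncertainty.

    ur_sq: variance component due to the reference fits (u_r^2)
    us_sq: variance component due to sampling (u_s^2)
    """
    return coverage * math.sqrt(ur_sq + us_sq)


if __name__ == "__main__":
    # Made-up values for illustration.
    print(rms_performance([3.0, 4.0]))
    print(expanded_uncertainty(9.0, 16.0))
```

Keeping the two components separate until this step preserves the distinction between the Type B (reference) and Type A (sampling) contributions.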



6. Summary 

This paper has described how NIST evaluates the 
performance of geometric fitting software used for in- 
spection. The NIST service, ATEP-CMS, is the only 
known test that provides quantitative measures of per- 
formance, complete with statements of uncertainty in 
accordance with international standards. 

ATEP-CMS is something new in the way of calibra- 
tion services. It is the only test of software offered by 
NIST in the field of dimensional metrology. It has the 
status of a Special Test, rather than a Calibration 
Service, because it is an experiment for us. The state- 
ments of uncertainty, in particular, are unsupported by 
any historical data. We believe, however, that they are 
fundamentally sound. We have tested some of our capa- 
bilities by running our software on both a personal 
computer and on NIST's supercomputer. These limited 
results support the validity of our approach. 

This paper has focused on the performance measures 
used in ATEP-CMS; we have not discussed testing pro- 
cedures. However, one key aspect of those procedures 
bears discussion: the selection of data sets to be used for 
the test. We use the collection of data sets as if the 
average performance of the software over those prob- 
lems represents the average performance of the software 
when it is used in production. To help ensure this, we use 
a stratified sampling approach in designing data sets for 
a test. Through judgment, we define ranges of parame- 
ters for inspection problems, including ideal geometry, 
form errors, surface sampling plans, and point measure- 
ment errors. Within each range, we select a representa- 
tive sampling of data sets. We believe that this mixed 
approach improves the quality of the test. 

We expect to refine our procedures as we gain experi- 
ence in testing. We also plan to extend the scope of our 
testing services beyond orthogonal distance regression 
to include other fitting methods and other common 
CMS software functions [23]. 

The methods described in this paper demonstrate that 
classical concepts of metrology can be used to assess 
the performance of software, when that software forms 
part of a measurement system. Moreover, it seems that 
performance testing like that offered by ATEP-CMS is 
a requirement if CMS measurements are to be traceable 
to accepted standards of length. 


